A parallel Clustering algorithm based on minimum spanning tree for microarrays data analysis

Authors

  • DINA ELSAYAD
  • AMAL KHALIFA
  • ESSAM KHALIFA
Abstract

Clustering is the partitioning of a set of observations into groups called clusters, where the observations in the same group share a common characteristic. One of the best-known algorithms for clustering microarray data using a minimum spanning tree (MST) is the CLUMP algorithm (Clustering algorithm through MST in Parallel), which identifies dense clusters in a noisy background. The MST construction phase is the time-consuming phase of CLUMP. This paper presents an improved version of the CLUMP algorithm called iCLUMP (improved Clustering algorithm through MST in Parallel). iCLUMP speeds up MST construction by using the cover tree data structure. The implementation shows that iCLUMP is more efficient than CLUMP in terms of complexity and runtime. Key-Words: Clustering; Minimum spanning tree; Microarrays; Bioinformatics; Parallel algorithm.
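The general idea of MST-based clustering mentioned in the abstract can be illustrated with a minimal sequential sketch in Python. This is not the CLUMP or iCLUMP implementation, which the abstract does not detail: it builds the MST of the complete Euclidean graph with a plain O(n^2) Prim's algorithm (no parallelism, no cover tree) and then cuts the k-1 heaviest edges, a common textbook variant. The function names `prim_mst` and `mst_clusters` and the parameter `k` are assumptions made for illustration.

```python
import math


def prim_mst(points):
    """Return the MST edges (weight, i, j) of the complete Euclidean graph.

    Plain O(n^2) Prim's algorithm; illustrative only, not CLUMP's parallel
    construction and not iCLUMP's cover-tree-based construction.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        # Relax distances of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Cut the k-1 heaviest MST edges and return one cluster label per point."""
    edges = sorted(prim_mst(points))
    kept = edges[: max(len(points) - k, 0)]  # the n-k lightest edges survive
    root = list(range(len(points)))  # union-find parent array

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


if __name__ == "__main__":
    # Two well-separated groups of hypothetical 2-D observations.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(mst_clusters(data, k=2))
```

Points that receive the same label belong to the same cluster. The all-pairs distance scan inside `prim_mst` is the costly step in this sketch, which is consistent with the abstract's remark that MST construction is the time-consuming phase.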


Similar resources

hiCLUMP : A hybrid Implementation of the CLUMP Algorithm for Clustering Microarrays Data

Microarray technology allows us to measure the expression levels of hundreds of thousands of genes simultaneously. The microarray data analysis process involves various computationally heavy tasks such as clustering. Clustering can be defined as partitioning a dataset into groups where objects in the same group are similar in some way. CLUMP (clustering through MST in parallel) is one of the ...


An Efficient Parallel Data Clustering Algorithm Using Isoperimetric Number of Trees

We propose a parallel graph-based data clustering algorithm using a CUDA GPU, based on exact clustering of the minimum spanning tree in terms of a minimum isoperimetric criterion. We also provide a comparative performance analysis of our algorithm against related ones, which demonstrates the general superiority of this parallel algorithm over competing algorithms in terms of accuracy and s...


Classification of encrypted traffic for applications based on statistical features

Traffic classification plays an important role in many aspects of network management, such as identifying the type of transferred data, detecting malware applications, applying policies to restrict network access, and so on. Basic methods in this field used obvious traffic features such as port number and protocol type to classify traffic. However, recent changes in applicat...


An Adaptive Parallel Hierarchical Clustering Algorithm

Clustering of data has numerous applications and has been studied extensively. It is very important in bioinformatics and data mining. Though many parallel algorithms have been designed, most of them use the CRCW-PRAM or CREW-PRAM models of computing. This paper proposes a parallel EREW deterministic algorithm for hierarchical clustering. Based on algorithms of complete graph and Euclidea...


A Metaheuristic Algorithm for the Minimum Routing Cost Spanning Tree Problem

The routing cost of a spanning tree in a weighted, connected graph is defined as the total length of the paths between all pairs of vertices. The objective of the minimum routing cost spanning tree problem is to find a spanning tree whose routing cost is minimum. This is an NP-hard problem, for which we present a GRASP with path-relinking metaheuristic algorithm. GRASP is a multi-start ...




Publication date: 2013